THE DENSE K SUBGRAPH PROBLEM



Abstract. Given a graph $G = (V, E)$ and a parameter $k$, we consider the problem of
finding a subset $U \subseteq V$ of size $k$ that maximizes the number of induced edges (DkS). We
improve upon the previously best known approximation ratio for DkS, a ratio that has not
seen any progress during the last decade. Specifically, we improve the approximation ratio
from $n^{0.32258}$ to $n^{0.3159}$. The improved ratio is obtained by studying a variant of the DkS
problem in which one seeks a subset $U \subseteq V$ of size at most $k$
that maximizes the number of induced edges. Finally, we study the DkS variant in which
one seeks a subset $U \subseteq V$ of size at least $k$ that maximizes the
number of induced edges.
1. Introduction

In this thesis, we consider the Densest k Subgraph problem (DkS). Given an undirected
graph $G(V, E)$, the DkS problem asks for a subgraph with exactly $k$ vertices
and the maximum number of induced edges. In addition, we consider two variants of DkS:
the Densest-at-least-k-Subgraph problem and the Densest-at-most-k-Subgraph problem
(both defined below). We present a number of results for these problems. Most
notably, we improve upon the previously best known approximation ratio for DkS, a ratio
that has not seen any progress during the last decade.

1.1. Definitions. 

Definition 1.1 (Densest-k-Subgraph). Given an undirected graph $G(V,E)$, the
Densest-k-Subgraph (DkS) problem on $G$ is the problem of finding a subset $U \subseteq V$ of vertices
of size $k$ with the maximum induced average degree. The average degree of the optimal
subgraph will be denoted as $d^* = 2|E(U)|/k$. Here $|E(U)|$ denotes the number of edges in
the subgraph induced by $U$.

Definition 1.2 (Densest-at-least-k-Subgraph). Given an undirected graph $G(V,E)$, the
Densest-at-least-k-Subgraph (DalkS) problem on $G$ is the problem of finding a subset $U \subseteq V$
of vertices of size at least $k$ with the maximum induced average degree (as defined in the
DkS problem).

Definition 1.3 (Densest-at-most-k-Subgraph). Given an undirected graph $G(V, E)$, the
Densest-at-most-k-Subgraph (DamkS) problem on $G$ is the problem of finding a subset
$U \subseteq V$ of vertices of size at most $k$ with the maximum induced average degree (as defined
in the DkS problem).

An $\alpha \ge 1$ approximation algorithm for these problems is an algorithm that, given $G$,
returns a subset of vertices $U$ of size $k^*$ such that $2|E(U)|/k^* \ge d^*/\alpha$, where $k^* = k$ for
DkS, $k^* \ge k$ for DalkS, and $k^* \le k$ for DamkS. Here, $d^*$ is the average degree of the optimal
subgraph for the respective problem.
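
The definitions above can be made concrete on a toy instance. The following is a minimal sketch, not part of the thesis: the helper names and the five-vertex graph are illustrative assumptions, and the exact solver enumerates all $k$-subsets, so it is only usable on very small graphs.

```python
from fractions import Fraction
from itertools import combinations


def induced_edges(edges, subset):
    """Count the edges with both endpoints inside `subset`, i.e. |E(U)|."""
    s = set(subset)
    return sum(1 for u, v in edges if u in s and v in s)


def average_degree(edges, subset):
    """Average degree 2|E(U)|/|U| of the subgraph induced by `subset`."""
    return Fraction(2 * induced_edges(edges, subset), len(subset))


def dks_bruteforce(vertices, edges, k):
    """Exact DkS by enumerating all k-subsets (exponential time)."""
    return max(combinations(vertices, k), key=lambda U: induced_edges(edges, U))


def is_alpha_approximation(edges, candidate, optimum, alpha):
    """Check the condition 2|E(U)|/k* >= d*/alpha for a candidate set."""
    return average_degree(edges, candidate) * alpha >= average_degree(edges, optimum)


# Hypothetical toy graph: a triangle {0, 1, 2} with a pendant path 2-3-4.
vertices = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
best = dks_bruteforce(vertices, edges, 3)
```

Here `best` is the triangle, with $d^* = 2$; the path $\{2, 3, 4\}$ has average degree $4/3$ and is therefore a 2-approximation but not a 1-approximation.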

1.2. Previous work. The DkS problem is NP-hard to solve exactly (a fact easily seen by
a reduction from the Max-Clique problem). The current best approximation ratio known
for the DkS problem is $n^{\delta}$ for some $\delta < 1/3$ [5], where $n$ is the number of vertices in the input
graph. This result was obtained using a combinatorial algorithm. To be more precise, the
algorithm of [5] is a combination of five different combinatorial algorithms, each of
which gives good results on different instances of the problem. The first algorithm
is a trivial one which always returns a subgraph with average degree at least 1.
The second algorithm is greedy and performs well when the input graph has several vertices
with high average degree relative to $n$. The third algorithm is also greedy, and gives good
results when $d^*$, the optimum value of the DkS instance, is high with respect to its size $k$.
The final two algorithms are tailor-made for specific relations between $d^*$ and $k$.

In [5] it is shown that it suffices to consider the first three algorithms to obtain a ratio of
$n^{1/3}$. The improvement to $n^{1/3-\epsilon}$, for some constant $\epsilon > 0$, is obtained by adding the
two additional algorithms, which are designed to take care of the special instances for which
the first three algorithms achieve a ratio no better than $n^{1/3}$. The exact value of
$\epsilon$ achieved in [5] is not stated explicitly; rather, it is only shown to be constant and thus
independent of $n$. Nevertheless, since the work of [5] (over a decade ago), there has been
no improvement in the approximation ratio of the DkS problem. In this thesis, we present
a detailed analysis of the ratio obtained by [5] (Section 5), and improve on this ratio by
replacing the fifth algorithm of [5] with a new one (Section 6).

The DkS problem was also studied by Feige and Seltser [7], where it is shown to be
NP-complete even when restricted to bipartite graphs of maximum degree 3 (using a reduction
from the Max-Clique problem). Similarly, Asahiro, Hassin, and Iwama [2]
showed that the problem remains NP-complete in very sparse graphs where $d = k^{\epsilon}$.

Khot [9] has shown there can be no PTAS for the densest k-subgraph problem,
under the assumption that NP does not have randomized algorithms that run in
time $2^{n^{\epsilon}}$ for some constant $\epsilon > 0$.

Two additional problems studied in this thesis, which are closely related to DkS and first
appear in the work of Anderson [1], are the DalkS and DamkS problems (Definitions 1.2 and
1.3 respectively). For the DalkS problem, Anderson presents an approximation algorithm
with a ratio of 2. The question of whether DalkS is NP-hard was not resolved in [1].
This is not the case for DamkS: in [1], Anderson proves that DamkS is NP-hard (by a
simple reduction from the Max-Clique problem). Moreover, he presents a connection between
approximating DamkS and DkS. Namely, if there exists a polynomial time algorithm that
approximates DamkS in a weak sense, returning a set of at most $\beta k$ vertices with average
degree at least $1/\gamma$ times the average degree of the densest subgraph on at most $k$ vertices,
then there exists a polynomial time approximation algorithm for DkS with ratio $4(\gamma^2 + \gamma\beta)$.

Another problem closely related to the DkS problem is the Densest Subgraph (DS) problem.
The DS problem concerns choosing a subset $U$ (regardless of its size) so as to maximize
the average degree of the subgraph induced by $U$. The DS problem can be solved in
polynomial time using either LP-based techniques [3] or flow techniques [8].

We note that a recent and independent work [10] presents results similar to ours regarding
the DalkS and the DamkS problems. Moreover, in the recent and independent work [3] a
better overall approximation guarantee of $n^{1/4+\epsilon}$ is given for the DkS problem. The running
time in [3] depends on $\epsilon$, and to obtain a ratio of $n^{1/4}$ a total running time of $n^{O(\log n)}$ is
needed.

1.3. Our results. In this thesis, we present several results regarding DkS and the
closely related problems DamkS and DalkS.

Our main result is the design and analysis of a new algorithm to be used in combination
with the first four algorithms of [5] (replacing the fifth algorithm of [5]). We show that
by adding our algorithm one is able to obtain an approximation ratio of $n^{0.3159}$ for DkS. To
compare this with the ratio of [5], we first study the original algorithms of [5] and show that
their combination results in an approximation ratio of $n^{0.32258} = n^{1/3-\epsilon}$ (thus computing
the value of $\epsilon$ unspecified in [5]). Our new algorithm is based on linear programming (LP),
and is designed to improve on the quality of the algorithms of [5] on certain DkS instances.
The exact ratio (or, to be more precise, an upper bound on the ratio) of the combined five
algorithms is obtained using a simple C-program.

The principle of our new algorithm is to guess $d^*$ (the optimum average degree), guess a
vertex $v$ inside an optimal solution (an optimal subset $U$), and then run an LP that takes
$d^*$ and $v$ into account. In our LP, a suitable DkS solution corresponds to a feasible LP
solution, but the other direction does not necessarily hold. Thus, the fractional LP solution
is rounded to get a feasible solution $U$ to DkS. The resulting approximation ratio
depends on several parameters of the instance at hand. Specifically, the ratio depends on
$k$, $d^*$, and $d_H$, the maximum degree in $G$. As mentioned,
integrating our new algorithm with the four of [5] (replacing the fifth one with ours), we
obtain the improved ratio.

Our other results were motivated by the work of Anderson [1]. In [1] the DalkS problem
was presented (see Definition 1.2), but the NP-hardness of this problem was left open. We
show, using a reduction from the DkS problem, that DalkS is also NP-hard.
Our proof takes a DkS instance and adds to it a big clique, making it an
instance of the DalkS problem. Also in [1], a 2-approximation was given for the DalkS
problem. We use [11] to show a more general approach for solving this problem, based
on optimizing supermodular functions. In [1] it was shown that if DamkS (see Definition
1.3) has a $\gamma$-approximation algorithm, then the algorithm can be used to approximate DkS
within a ratio of $4\gamma^2$. We reduce the latter to $4\gamma$. Our improved reduction is iterative and
relies strongly on the fact that in each iteration we only remove edges from the original
graph $G$ if they are included in the final DkS solution.

1.4. Thesis structure. Our work includes the following sections. First we prove that
DalkS is NP-complete (Section 2). This resolves the question left open in [1]. We proceed
to present a general paradigm that yields a 2-approximation algorithm for
DalkS in polynomial time (Section 3). This complements the 2-approximation algorithm
presented in [1]. Turning to the DamkS problem, in Section 4 we show a strong connection
between the ability to approximate DamkS and DkS. Namely, improving on the connections
described in [1], we show that any ratio achievable on DamkS is also (up to constant factors)
achievable on DkS. In Section 5 we study the DkS problem, and present a full analysis
of the approximation ratio of [5]. Namely, we compute the value of $\epsilon$ left unresolved in
[5]. In Section 6 we present our new algorithm for DkS. This section builds strongly on
the previous ones. Finally, in Section 7 we present the numerical techniques (a C-program)
used to determine the ratios stated throughout our work. We also rigorously bound any
slackness that may arise from these techniques.

2. The DalkS problem is NP hard

In his work [1], Anderson gave a 2-approximation algorithm for the DalkS problem and
mentioned that it is not known whether this problem is NP hard. We show that this
problem is NP hard.

Theorem 2.1. Finding the densest subgraph with at least k vertices is NP hard. 

Proof. In this proof, for a graph $H$, the term density refers to the average degree of
$H$. Let $G(V, E)$ be an instance of the DkS problem with $|V| = n$. Let $G'$ be the graph
consisting of $G$ and a clique of size $3n$, denoted $K_{3n}$. Namely, the vertex set of $G'$ consists of the
vertices of $G$ and an additional $3n$ new vertices, and the edge set of $G'$ consists of the edges
of the original graph $G$ and the edges of a complete graph on the new $3n$ vertices. Let
$k' = k + 3n$. Let $H$ be the optimal subset in $G'$ with respect to the DalkS problem with
parameter $k'$. We claim that $H$ consists of exactly the densest subgraph in $G$ of size $k$
and the clique $K_{3n}$. This suffices to prove our claim (as the DkS problem is NP-hard).



First we show that $K_{3n} \subseteq H$. Suppose that the size of $H$ is $b_1 + b_2$, where $b_1$ denotes the
number of vertices in $H$ taken from the clique $K_{3n}$ and $b_2$ denotes the number of vertices
in $H$ taken from the original graph $G$. For the sake of contradiction assume $b_1 < 3n$. We
will show that we can take $3n - b_1$ vertices out of the $b_2$ vertices in $G$ and select instead
the $3n - b_1$ remaining vertices of $K_{3n}$, making the clique complete. By doing so we increase the number
of edges in $H$ without increasing the number of vertices, hence increasing the density and
contradicting the optimality of the solution.

Let us show that the number of edges increases. First we give names to the different
groups of vertices. Group $B_1$ will be the $b_1$ vertices of $K_{3n}$ defined above, $B_2$ will be the $b_2$
vertices of $G$ defined above, $T_1$ will be the group of newly selected vertices in $K_{3n}$ (namely
$T_1 = K_{3n} \setminus B_1$), $T_2$ will be the $3n - b_1$ vertices we removed from $B_2$, and $R_2 = B_2 \setminus T_2$.
We need to show that $|E(T_1)| + |\delta(T_1, B_1)| > |E(T_2)| + |\delta(T_2, R_2)|$. Here $\delta(A, B)$ refers
to the edges crossing between vertex sets $A$ and $B$. Notice that $b_1 + b_2 \ge 3n$.

Since $T_1$ is a subset of the clique $K_{3n}$ and $|T_1| = |T_2|$, it holds that $|E(T_1)| \ge |E(T_2)|$.
Next we want to show that $|\delta(T_1, B_1)| > |\delta(T_2, R_2)|$. $T_1$ and $B_1$ are subsets of the clique
$K_{3n}$, thus there are edges between all the vertices in $T_1$ and all the vertices in $B_1$. Since $T_1$
has the same size as $T_2$, the only way $|\delta(T_2, R_2)|$ can be greater than or equal to $|\delta(T_1, B_1)|$ is if
$|R_2|$ is greater than or equal to $|B_1|$. We now show that this cannot happen, namely $|B_1| > |R_2|$:
$|B_1| = 3n - |T_1| > n - |T_2| \ge |B_2| - |T_2| = |R_2|$. Thus, we deduce that every optimal
solution $H$ must include the clique $K_{3n}$.

Next we show that exactly $k$ vertices are selected from $G$. Recall that the number
of vertices chosen from $G$ is denoted $b_2$. The value $b_2$ must be at least $k = k' - 3n$. This follows from
the fact that the algorithm returns a subgraph $H$ of $G'$ of size at least $k'$ which includes
the clique $K_{3n}$ and additional vertices from $G$. Now we show that selecting more than $k$
vertices from $G$ contradicts the optimality of the solution.

Suppose for the sake of contradiction that the solution $H$ has $b_2$ vertices from $G$ where
$b_2 > k$. Let $d$ be the average degree of the subgraph of $G$ induced by these $b_2$ vertices. The
average degree of the resulting graph $H$ will be
\[ \bar{d} = \frac{3n(3n-1) + d b_2}{3n + b_2}. \]
If we remove the vertex of
minimal degree in $H \cap G$ we lose at most $d$ edges (or at most $2d$ in the sum of degrees),
hence getting a new average degree
\[ \bar{d}' \ge \frac{3n(3n-1) + d b_2 - 2d}{3n + b_2 - 1}. \]
This new density is bigger than $\bar{d}$, as can be seen from the following calculation:
\[ \bar{d}' - \bar{d} \ge \frac{3n(3n-1) + d b_2 - 2d}{3n + b_2 - 1} - \frac{3n(3n-1) + d b_2}{3n + b_2}
 = \frac{3n(3n-1) - 6nd - d b_2}{(3n + b_2)(3n + b_2 - 1)}. \]
As $d < n$ and $b_2 \le n$, the numerator is greater than $9n^2 - 3n - 7n^2 = 2n^2 - 3n$, so
$\bar{d}' - \bar{d} > 0$. The
last inequality holds for $n > 1$. So we found a way to improve the density by removing
vertices from $H$. This implies that $b_2$ must be as small as possible. Nevertheless, the demand
is that it be at least $k$, so we deduce $b_2 = k$.

Finally, the $k$ vertices from $G$ selected for $H$ contribute the most when
they form the densest subgraph of $G$ of size $k$. Let $d$ be the average degree of the $k$
vertices selected from $G$. Let $d^* \ge d$ be the average degree of the densest subgraph of size
$k$ in $G$. As
\[ \frac{3n(3n-1) + kd}{3n + k} \le \frac{3n(3n-1) + kd^*}{3n + k}, \]
we gain from taking the densest subgraph of
size $k$ in $G$.

We conclude that the $k$ vertices selected from $G$ must be the DkS solution. Since DkS is
NP hard we conclude our assertion. □
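
The reduction can be checked by brute force on a tiny instance. The sketch below is an illustration under stated assumptions, not part of the proof: the function names and the four-vertex toy graph are hypothetical, and both problems are solved by exhaustive enumeration.

```python
from fractions import Fraction
from itertools import combinations


def count_edges(edges, subset):
    """Number of edges with both endpoints in `subset`."""
    s = set(subset)
    return sum(1 for u, v in edges if u in s and v in s)


def build_reduction(n, edges, k):
    """Return (vertices', edges', k') for G' = G plus a disjoint clique K_{3n}."""
    clique = list(range(n, 4 * n))                # the 3n new vertices
    clique_edges = list(combinations(clique, 2))
    return list(range(4 * n)), list(edges) + clique_edges, k + 3 * n


def densest_at_least(vertices, edges, k_min):
    """Exact DalkS by brute force over all subsets of size >= k_min."""
    best, best_density = None, Fraction(-1)
    for size in range(k_min, len(vertices) + 1):
        for subset in combinations(vertices, size):
            density = Fraction(2 * count_edges(edges, subset), size)
            if density > best_density:
                best, best_density = set(subset), density
    return best


# Hypothetical DkS instance: a 4-cycle with one chord, and k = 2.
n, k = 4, 2
g_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
verts, all_edges, k_prime = build_reduction(n, g_edges, k)
H = densest_at_least(verts, all_edges, k_prime)
g_part = {v for v in H if v < n}                  # vertices of H taken from G
dks_opt = max(count_edges(g_edges, U) for U in combinations(range(n), k))
```

On this instance the optimal DalkS solution contains all of $K_{3n}$ together with exactly $k$ vertices of $G$ that form an optimal DkS solution, as the theorem predicts.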

3. A 2 APPROXIMATION ALGORITHM FOR THE DalkS PROBLEM

In his work [1], Anderson presented a 2-approximation for the DalkS problem. We present
an alternative proof that holds for a family of problems which includes DalkS. Our algorithm
is based on a known algorithm for the Dense-Subgraph problem.

Let G(V, E) be an undirected graph and let S be any subset of V . Denote by E(S) the 
edges induced by the subset S. 

Lemma 3.1. Let q > 0, we can solve the problem of maximizing \E(S)\ — q\S\ exactly in 
polynomial time. 

Proof. Let $U$ be a ground set, and let $X$ and $Y$ be subsets of $U$. A set function $f$ is
supermodular if

\[ f(X) + f(Y) \le f(X \cap Y) + f(X \cup Y) \quad \forall X, Y \subseteq U. \]

A set function $p$ is submodular if

\[ p(X) + p(Y) \ge p(X \cap Y) + p(X \cup Y) \quad \forall X, Y \subseteq U. \]

If $f(X)$ is supermodular and $p(X)$ is submodular, then for $q > 0$ the function $f(X) -
q\,p(X)$ is supermodular. The problem of maximizing a supermodular function can be solved
(under certain rationality restrictions) in polynomial time.

For $G(V,E)$, let $X$ and $Y$ be subsets of $V$. We now show that the function $\gamma(X) =
|E(X)|$ is supermodular, and the function $\delta(X) = |X|$ is submodular. This follows by
counting the contribution of edges to the two sides in the definition of $f$ above. Edges in
$E(X - Y)$ and $E(Y - X)$ are counted once on both sides (as are edges between $X \cap Y$ and
either difference), while edges in $E(X \cap Y)$ are
counted twice on both sides. Edges between $X - Y$ and $Y - X$ are counted only on the right hand side.
This proves our claim for $\gamma(X)$. The proof for $\delta(X)$ is straightforward. □
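
The two inequalities can be verified exhaustively on a small graph. This is a minimal sketch for illustration only: the graph, the value $q = 3/2$, and all function names are assumptions, and the check enumerates every pair of subsets.

```python
from fractions import Fraction
from itertools import combinations


def powerset(items):
    """Yield every subset of `items` as a frozenset."""
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            yield frozenset(subset)


def is_supermodular(f, ground):
    """Check f(X) + f(Y) <= f(X & Y) + f(X | Y) for all pairs of subsets."""
    subsets = list(powerset(ground))
    return all(f(X) + f(Y) <= f(X & Y) + f(X | Y) for X in subsets for Y in subsets)


def is_submodular(p, ground):
    """Check p(X) + p(Y) >= p(X & Y) + p(X | Y) for all pairs of subsets."""
    subsets = list(powerset(ground))
    return all(p(X) + p(Y) >= p(X & Y) + p(X | Y) for X in subsets for Y in subsets)


# Hypothetical toy graph: a 4-cycle with one chord.
ground = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
q = Fraction(3, 2)


def gamma(X):
    """Number of induced edges |E(X)|."""
    return sum(1 for u, v in edges if u in X and v in X)


def delta(X):
    """Cardinality |X|."""
    return len(X)


def objective(X):
    """The function |E(X)| - q|X| maximized in Lemma 3.1."""
    return gamma(X) - q * delta(X)
```

For this graph, `is_supermodular(gamma, ground)`, `is_submodular(delta, ground)`, and `is_supermodular(objective, ground)` all hold.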
Maximizing $|E(S)| - q|S|$ for any $q > 0$ can aid us in finding a 2-approximation for DalkS.
Let us define $G^*(S^*, E^*)$ as the densest subgraph with at least $k$ vertices, and let its average
degree be $d^*$.

Namely, $|E^*| = d^*|S^*|/2$. For $q = d^*/4$, let $S$ be the subset maximizing $|E(S)| - q|S|$ and
let $|E(S)| = d|S|/2$. We show that $S$ yields our approximate solution. It holds that:

\[ |E(S)| - \frac{d^*}{4}|S| \ge |E(S^*)| - \frac{d^*}{4}|S^*| = |E^*| - \frac{d^*}{4}|S^*| = |E^*|/2. \]

If $|S| \ge k$, from the fact that $|E(S)| - \frac{d^*}{4}|S| \ge 0$ we get $d|S|/2 = |E(S)| \ge \frac{d^*}{4}|S|$, which
implies that $d \ge d^*/2$.

If $|S| < k$ we add arbitrary vertices to $S$ until it is of size $k$. Let $E'$ be the edge set
induced by the enlarged set $S$ and let $d' = 2|E'|/k$ be its average degree.
We have:

\[ d' = 2|E'|/k \ge 2|E(S)|/k \ge |E^*|/k \ge |E^*|/|S^*| = d^*/2. \]

The second inequality follows from the fact that $|E(S)| - \frac{d^*}{4}|S| \ge |E^*|/2$, and the third
from $|S^*| \ge k$. We thus achieve a
2-approximation no matter what the size of $S$ is.

Remark 3.2. Since we do not know $d^*$, we cannot compute $q$ in advance. We can try to guess
it in different ways. The naive approach is to exhaust all the possibilities. Since
$d^* = 2|E^*|/|S^*|$ with $|E^*| \in \{0, 1, \ldots, |E|\}$ and $|S^*| \in \{1, \ldots, |V|\}$, the
number of possibilities for $d^*$ is at most $(|E| + 1)|V|$.
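
The whole scheme of this section (guess $d^*$, maximize $|E(S)| - q|S|$ with $q = d^*/4$, pad to size $k$) can be sketched as follows. This is an illustration under explicit assumptions: the supermodular maximizer of Lemma 3.1 is replaced by a brute-force search that only works on tiny graphs, and all names and the toy graph are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def count_edges(edges, subset):
    """Number of edges with both endpoints in `subset`."""
    s = set(subset)
    return sum(1 for u, v in edges if u in s and v in s)


def subsets(vertices, min_size=0):
    """Yield all subsets of `vertices` with at least `min_size` elements."""
    for size in range(min_size, len(vertices) + 1):
        yield from combinations(vertices, size)


def maximize_edges_minus_q(vertices, edges, q):
    """Brute-force stand-in for the polynomial-time maximizer of Lemma 3.1."""
    return max(subsets(vertices), key=lambda S: count_edges(edges, S) - q * len(S))


def dalks_two_approx(vertices, edges, k):
    """Try every candidate value of d*, pad each maximizer to size k, keep the best."""
    guesses = {Fraction(2 * m, s)
               for m in range(len(edges) + 1)
               for s in range(1, len(vertices) + 1)}
    best, best_density = None, Fraction(-1)
    for d_guess in guesses:
        S = set(maximize_edges_minus_q(vertices, edges, d_guess / 4))
        for v in vertices:            # pad with arbitrary vertices while |S| < k
            if len(S) >= k:
                break
            S.add(v)
        density = Fraction(2 * count_edges(edges, S), len(S))
        if density > best_density:
            best, best_density = S, density
    return best, best_density


def dalks_exact(vertices, edges, k):
    """Exact optimum average degree over all subsets of size >= k."""
    return max(Fraction(2 * count_edges(edges, S), len(S))
               for S in subsets(vertices, k))


# Hypothetical toy graph: K4 on {0, 1, 2, 3} plus the path 3-4-5.
vertices = list(range(6))
edges = list(combinations(range(4), 2)) + [(3, 4), (4, 5)]
approx_set, approx_density = dalks_two_approx(vertices, edges, k=5)
optimum = dalks_exact(vertices, edges, k=5)
```

Because the correct value of $d^*$ is among the guesses, the returned set has at least $k$ vertices and average degree at least half of `optimum`.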

4. The DamkS and DkS problems

Definition 4.1. An algorithm $A(G, k)$ is a $(\beta, \gamma)$-approximation algorithm for the DamkS
problem if for input graph $G$ and size $k$ it returns a solution with at most $\beta k$ vertices and
an average degree of at least $d_{am}(G, k)/\gamma$, where $d_{am}(G, k)$ is the optimal average degree
of the DamkS problem on $G$.

Anderson, in his work [1], has shown the following: If there is a $(\beta, \gamma)$-approximation
algorithm for DamkS (with $\beta$ and $\gamma$ greater than or equal to 1), then there is a
$4(\gamma^2 + \gamma\beta)$-approximation algorithm for DkS. For the specific case where $\beta = 1$ this gives a
$4\gamma^2$-approximation ratio. We significantly improve upon this result of [1]:

Theorem 4.2. If there is a $\gamma$-approximation algorithm for DamkS (with $\gamma$ greater than or equal
to 1) then there is a $4\gamma$-approximation algorithm for DkS.
Proof. We specify our algorithm for DkS. As done in several places in this thesis, we
assume that the value $d^*$ is known.

Algorithm 4.3. [Solve DkS using DamkS]

Let $G(V, E)$ and $k$ be an input instance to the DkS problem. Let $S$ be an empty set of
vertices.

a) Using the approximation algorithm for DamkS on $G$, find an approximate solution $S'$
with at most $k$ vertices.

b) Add the vertices of $S'$ and its induced edges to $S$. Remove the edges induced by $S'$ from
$G$.

c) If the number of edges in $S$ satisfies $|E(S)| < \frac{1}{4}kd^*$ and $|S| < k$, go back to (a) and continue.

d) If $|S| < k$, add to it an arbitrary set of $k - |S|$ additional vertices of $G$.

e) Otherwise $|S| \ge k$. Greedily remove the lowest degree vertices from $S$ until it is
of size $k$.

f) Return $S$.
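
Steps (a) to (f) can be sketched in code as follows. This is an illustration under explicit assumptions rather than the thesis's implementation: `damks_oracle` is a hypothetical exact brute-force solver (so $\gamma = 1$), the extra `gained == 0` guard is an addition that prevents an endless loop when the guessed `d_star` is too large, and the toy graph is made up.

```python
from fractions import Fraction
from itertools import combinations


def count_edges(edges, subset):
    """Number of edges with both endpoints in `subset`."""
    s = set(subset)
    return sum(1 for u, v in edges if u in s and v in s)


def damks_oracle(vertices, edges, k):
    """Hypothetical DamkS oracle: exact brute force, i.e. gamma = 1."""
    best, best_density = set(), Fraction(-1)
    for size in range(1, k + 1):
        for subset in combinations(vertices, size):
            density = Fraction(2 * count_edges(edges, subset), size)
            if density > best_density:
                best, best_density = set(subset), density
    return best


def dks_via_damks(vertices, edges, k, d_star):
    """Sketch of steps (a)-(f); `d_star` is the guessed optimum average degree."""
    remaining = list(edges)      # edges still present in G
    S, collected = set(), 0      # vertices of S and number of edges added to S
    while True:
        s_prime = damks_oracle(vertices, remaining, k)             # step (a)
        gained = count_edges(remaining, s_prime)
        S |= s_prime                                               # step (b)
        collected += gained
        remaining = [(u, v) for u, v in remaining
                     if not (u in s_prime and v in s_prime)]
        # step (c): repeat only while S is both sparse and small
        if gained == 0 or not (4 * collected < k * d_star and len(S) < k):
            break
    for v in vertices:                                             # step (d)
        if len(S) >= k:
            break
        S.add(v)
    while len(S) > k:                                              # step (e)
        degree = {v: 0 for v in S}
        for u, v in edges:
            if u in S and v in S:
                degree[u] += 1
                degree[v] += 1
        S.remove(min(S, key=lambda v: degree[v]))
    return S                                                       # step (f)


# Hypothetical toy graph: two triangles joined by the edge 2-3, with k = 4.
vertices = list(range(6))
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
k = 4
d_star = max(Fraction(2 * count_edges(edges, U), k) for U in combinations(vertices, k))
result = dks_via_damks(vertices, edges, k, d_star)
```

With an exact oracle the theorem promises average degree at least $d^*/4$ on exactly $k$ vertices; on this instance the sketch returns a set with four induced edges, which is optimal.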

We now analyze the suggested algorithm:

Denote an optimal subset to the DkS problem of size $k$ by $S^*$, and its edge set by $E^*$.
It holds that $d^*k = 2|E^*|$ for the optimal average degree $d^*$. If at the end of our algorithm
$S$ includes at least $\frac{1}{4\gamma}|E^*|$ edges of $E$ then it holds that $|E(S)| \ge \frac{1}{8\gamma}kd^*$. As $|S| = k$ this
implies an approximation ratio of $4\gamma$.

We will prove that each iteration of (a) picks vertices with average degree at
least $\frac{1}{2\gamma}d^*$. The first time we are at (a) the graph $G$ is still the original graph and so is
the optimal subset $S^*$, so the DamkS algorithm can pick a subgraph of size at most $k$
with average degree at least $d^*/\gamma$.

Lemma 4.4. At any other iteration, one of the following two must hold: either the vertices of $S^*$ in
$G$ still have average degree higher than $\frac{1}{2}d^*$ or the set $S$ satisfies $|E(S)| \ge \frac{1}{4}kd^*$.

Proof. Suppose the second condition does not hold; then $|E(S)| < \frac{1}{4}kd^*$. This means that
$S$ includes less than half of the edges that were originally induced by $S^*$. It follows that
there are still more than $\frac{1}{4}kd^*$ edges between the vertices of $S^*$. So the set $S^*$ in $G$ has at
most $k$ vertices while having more than $\frac{1}{4}kd^*$ edges; namely, its average degree is higher than
$\frac{1}{2}d^*$. □

Notice that Lemma 4.4 implies that each time we do not pass from step (c) of the
algorithm to step (d) (and rather return to (a)), the set $S'$ will have average degree at least
$\frac{1}{2\gamma}d^*$. This follows from the fact that there is still a subset (the subset $S^*$) in $G$ that has
size at most $k$ and average degree at least $\frac{1}{2}d^*$. This implies that at each visit to step (c)
the set $S$ has average degree at least $\frac{1}{2\gamma}d^*$.

By the time we pass step (c), one of the two following cases must hold. Either $|S| < k$,
and thus it must hold that there are at least $\frac{1}{4}kd^*$ edges in $S$. In this case step (d) of
the algorithm suffices to obtain the desired set $S$. Or we are in the case in which $|S| \ge k$.
In this case, it holds that $|S| \le 2k - 1$ (as in the previous iteration $S$ was smaller than $k$,
and in each iteration at most $k$ vertices are added to $S$). Moreover, as in each iteration the
average degree of the set $S'$ added to $S$ is at least $\frac{1}{2\gamma}d^*$, the average degree in $S$ is also at least
$\frac{1}{2\gamma}d^*$. Thus by Lemma 4.5 (appearing below), after we remove vertices from $S$ in step (e)
we remain with a set $S$ of average degree at least $\frac{1}{4\gamma}d^*$.

The following lemma, which is needed for the completeness of this proof, is taken from [6]
(Lemma C.1). We state it here without its proof.

Lemma 4.5. (Fixing Lemma) Given a set $U$ of size $|U| \ge k$ and weight $W$ we can efficiently
find a subset $U' \subseteq U$ of size $k$ and weight at least $W \frac{k(k-1)}{|U|(|U|-1)}$.

In the above, the term weight refers to the weight given to each edge in the graph and 
W refers to the sum of these weights. In an unweighted graph, every edge has a weight of 
1 and W translates to the number of edges. 

Remark 4.6. Theorem 4.2 states a connection between the DamkS and DkS problems.
Namely, a $\gamma$-approximation algorithm for the former yields a $4\gamma$-approximation for the
latter. In the remainder of this thesis we will use Theorem 4.2 a few times when $\gamma = \gamma(d^*_{am})$
is a monotone decreasing function of $d^*_{am}$, the average degree of the optimal solution to the
DamkS problem. For an optimal value $d^*$ to the DkS problem, Lemma 4.4 above implies
that during any execution of step (a) in Algorithm 4.3 any subgraph of $G$ considered will
have a subgraph of size at most $k$ with average degree at least $d^*/2$. Note that this implies
that the optimal solution to DamkS on $G$ will also have average degree at least $d^*/2$. This
implies, following the analysis above, that we can promise an approximation ratio for DkS
of at most $4\gamma(d^*/2)$. The dependence of $\gamma$ on $d^*$ that we will use in the upcoming sections
is polynomial; thus in these cases we obtain a slightly weaker reduction between DamkS
and DkS. However, we stress that the ratio between the approximation of the former and
the latter remains constant in these cases.
5. The previously best known ratio for DkS

As mentioned previously, the currently best known approximation ratio for the
Dense-k-Subgraph problem stands at $n^{\delta}$ for a constant $\delta$ slightly less than 1/3 [5]. The algorithm
presented in [5] is composed of 5 algorithms $A_1, \ldots, A_5$. Computing the approximation
ratio for each of these algorithms, the authors of [5] are able to prove that $\delta \le 1/3 - \epsilon$ for some $\epsilon > 0$.
However, the analysis in [5] does not attempt to find the precise value of $\delta$. In
what follows we calculate $\delta$. The bound is based on the analysis of [5] and is obtained in two
steps. First, we revisit the 5 algorithms appearing in [5] and (by refining the analysis of [5])
we present their approximation ratio as a function of $k$, $d^*$, and $d_H$. Here, as in [5], $d^*$ is
the average degree of the optimal solution to the problem at hand, and $d_H$ is the average
degree of the $k/2$ highest-degree vertices in the graph. A full analysis as described above appears
in [5] for the first three algorithms, so here we just state their results. For algorithms $A_4$
and $A_5$ our analysis is new. Second, after we have determined the approximation ratio for
all five algorithms, denoted $r_i(k, d^*, d_H)$ for algorithm $A_i$, we run a simple C-program that
computes a lower bound on

\[ \max_{k, d^*, d_H} \; \min_{i=1,\ldots,5} r_i(k, d^*, d_H). \]

Specifically, our C-program (given in Section 7) computes $\min_{i=1,\ldots,5} r_i(k, d^*, d_H)$ on a
large set of triplets $(k, d^*, d_H)$. Taking the maximum value of $\min_{i=1,\ldots,5} r_i(k, d^*, d_H)$ over
the triplets considered yields the desired lower bound on $\delta$.
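
The max-min computation can be sketched as a plain grid search. The snippet below is a Python illustration of the idea only, not the C-program of the thesis. It works with exponents, writing $k = n^a$, $d^* = n^b$, $d_H = n^c$, which is an assumption of this sketch, as are the grid and the constraint $b \le a$. Only two illustrative ratio functions are plugged in, so the resulting number carries no meaning until all five ratios are supplied.

```python
import math


def max_min_over_grid(ratio_fns, grid):
    """Return (value, point) attaining the max over the grid of min_i r_i(point)."""
    best_value, best_point = -math.inf, None
    for point in grid:
        value = min(r(*point) for r in ratio_fns)
        if value > best_value:
            best_value, best_point = value, point
    return best_value, best_point


def exponent_grid(steps):
    """Triplets (a, b, c) with k = n^a, d* = n^b, d_H = n^c on a uniform grid."""
    ticks = [i / steps for i in range(steps + 1)]
    return [(a, b, c) for a in ticks for b in ticks for c in ticks
            if b <= a]  # assumption: the average degree d* cannot exceed k


def log_r1(a, b, c):
    """log_n of r_1 = d* (Lemma 5.2)."""
    return b


def log_r4(a, b, c):
    """log_n of the A_4 bound k^2 d_H^(1/3) / (d*)^2, constants dropped."""
    return 2 * a + c / 3 - 2 * b


value, point = max_min_over_grid([log_r1, log_r4], exponent_grid(steps=20))
```

A finer grid tightens the lower bound at the cost of a cubic growth in the number of triplets.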

Before we state and prove the main theorem and lemmas of this section, a few remarks
are in place. In [5], five algorithms were considered, and for each such algorithm $A_i$ an
approximation ratio $r_i$ was determined. As stated above, we follow the proof of [5] and
present an enhanced analysis for $r_i$. Using this analysis, we compute the approximation
ratio obtained via combining these five algorithms. Given that our understanding of [5] is
precise (we have made every effort to justify this assumption), our analysis highlights the
limitation of the proof in [5] and yields Theorem 5.1 stated below. Namely, in our analysis,
we present triplets $(k, d^*, d_H)$ for which the enhanced analysis of [5] does not promise an
approximation ratio better than $n^{0.32258}$. We stress that our claim is on the analysis of [5]
alone and not on the actual performance of their combined algorithm, which potentially
could do better than the analysis shows. Throughout this section, when we state that an
approximation ratio $r_i$ is equal to some expression, we mean that based on our enhanced
analysis a ratio of $r_i$ can be obtained, and our analysis does not promise any ratio better
than $r_i$.
Theorem 5.1. Our enhanced analysis of the algorithms presented in [5] promises an
approximation ratio no lower than $n^{0.32258}$.

5.1. Algorithms $A_1$, $A_2$, and $A_3$.

Lemma 5.2 ([5]). The approximation ratio of Algorithm $A_1$ from [5] is $r_1(k, d^*, d_H) = d^*$.

Lemma 5.3 ([5]). The approximation ratio of Algorithm $A_2$ from [5] is $r_2(k, d^*, d_H) =$

In [5], algorithm $A_2$ was used in more than one way. The following lemma (essentially
proven in [5]) makes use of $A_2$ and is needed later.

Lemma 5.4. Let $G(V, E)$ be a given graph. Let $d_H$ be the average degree of the $k/2$ highest
degree vertices in $G$. One can efficiently find a subgraph $G'$ of $G$ of maximum degree $d_H$
such that either (a) running algorithm $A_2$ on $G$ yields an approximation ratio of 3, or (b)
the optimal value of the DkS problem on $G'$ is at least a third of the optimal value on $G$.

Proof. Based on Lemma 3.3 in [5], removing the $k/2$ vertices with highest degree from $G$
reduces the optimum solution $d^*$ to $d' \ge d^* - 2d^*/r_2$, where $d^*/r_2$ is the average degree
yielded by the second algorithm of [5]. So either $d^*/r_2 \ge d^*/3$ or $d' \ge d^*/3$, and the lemma
follows. □

Since a constant-factor loss in the approximation ratio is not our concern in this thesis, from
this point on we refer to $d_H$ as the highest degree in the graph.

Lemma 5.5 ([5]). The approximation ratio of Algorithm $A_3$ from [5] is $r_3(k, d^*, d_H) =$

2max{k,du ) 

5.2. Algorithm $A_4$.

Lemma 5.6 ([5]). The approximation ratio of Algorithm $A_4$ from [5] is
\[ r_4(k, d^*, d_H) = \frac{2k^2 (2d_H)^{1/3}}{(d^*)^2}. \]

Algorithm $A_4$ in [5] activates algorithms $A_1$, $A_2$, and $A_3$ on a subgraph $G'$ of $G$ of size
$n' \le 2d_H$. By the analysis in [5] (Lemma 4.4 of [5]), $G'$ is promised to have a subgraph of size
$k$ and average degree at least $d' \ge (d^*)^3/k^2$. As it is shown in [5] that $\min(r_1, r_2, r_3) \le 2n^{1/3}$,
we conclude that $r_4(k, d^*, d_H) \le d^* \cdot 2(n')^{1/3}/d' \le 2k^2(2d_H)^{1/3}/(d^*)^2$.
5.3. Algorithm $A_5$.

Lemma 5.7 ([5])' The approximation ratio of Algorithm A§ from [5] is r$(k,d* ,djj) = 
9(max( fc , k 1 d f"-i 1 )) f or the case where d 2 H < k and r$(k, d* , dn) = 6(max( k d ji H , ^73)) 
where d 2 H > k > du 

In [5], algorithm $A_5$ was analyzed for a very specific set of parameters $k$, $d^*$, and $d_H$. We
now extend the analysis of [5] to obtain the ratio $r_5$, which is a parametric version of the
ratio obtained in [5]. In what follows we revisit Procedure 5 (walks of length 5) from [5].
First let us recall a lemma presented and proved in [5]. An $l$-walk between $u$ and $v$ is a
walk of $l$ edges between $u$ and $v$.

Lemma 5.8 ([5]). For a graph of size $k$, average degree $d$, and a number $l$, there must exist
two vertices $u$ and $v$ for which the number of $l$-walks from $u$ to $v$ is $W_l[u, v] \ge d^l/k$.

We use the above lemma with $l = 5$. Namely, from the existence of an optimal subgraph
$G^*$ of size $k$ and average degree $d^*$ we infer that there must exist two vertices $u$ and
$v$ in the graph for which $W_5[u, v] \ge (d^*)^5/k$.
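
The walk counts $W_l[u, v]$ are the entries of the $l$-th power of the adjacency matrix, so the bound of Lemma 5.8 is easy to check numerically. The sketch below is illustrative only: the function names and the triangle example are assumptions, and the matrix power is computed naively.

```python
from fractions import Fraction


def walk_counts(n, edges, length):
    """Return W where W[u][v] is the number of walks with `length` edges from u to v."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] += 1
        adj[v][u] += 1
    walks = [[int(i == j) for j in range(n)] for i in range(n)]  # length-0 walks
    for _ in range(length):
        walks = [[sum(walks[i][t] * adj[t][j] for t in range(n))
                  for j in range(n)] for i in range(n)]
    return walks


def lemma_bound(n, edges, length):
    """The bound d^l / k for a graph with k = n vertices and average degree d."""
    d = Fraction(2 * len(edges), n)
    return d ** length / n


# Hypothetical example: a triangle, so k = 3 and d = 2.
triangle = [(0, 1), (1, 2), (0, 2)]
W5 = walk_counts(3, triangle, 5)
```

For the triangle, every pair of distinct vertices is joined by 11 walks of length 5, which exceeds the bound $2^5/3$.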

We will use this observation in order to calculate the ratio in this case. Let us define
$N_1, N_2, N_3, N_4$ to be the subsets of vertices that are first, second, third, and fourth on
the walks of length 5 between $u$ and $v$. As stated in [5], it may be the case that a vertex
appears in more than a single set. This may cause some edges in the analysis below to
be counted several times, but the multiplicity remains constant and does not affect the
analysis beyond a constant factor. Thus we assume in our analysis that the subsets are
disjoint. For completeness, we rewrite the second part of Section 4.2 in [5] with the intention
of extracting the function $r_5$.

Proof of Lemma 5.7. Let $\epsilon(d^*, d_H, k)$ be a function to be determined later in the proof.
In what follows we will show how to find in $G$ a subgraph $G'$ of size $O(k)$ with average
degree $\Omega(\epsilon)$. Using Lemma 5.4 and Theorem 4.2 this implies an approximation ratio for
the DkS problem of $O(d^*/\epsilon)$. We refer the reader to Remark 4.6 and note that $d_H$ and $k$
do not change during Algorithm 4.3. Our proof is done by a detailed case analysis based
on that presented in [5]. Our calculation is separated into two main cases: $d_H^2 < k$ and
$d_H^2 \ge k \ge d_H$.

Case 1: $d_H^2 < k$. Assume that $cut(N_2, N_3) \ge \epsilon d_H^2$. Here $cut(A, B)$ is the number of edges
with one end point in $A$ and another in $B$. In this case, as the size of $N_2$ and $N_3$ is bounded
by $d_H^2 < k$, we take $G'$ to be the graph induced by $N_2 \cup N_3$. It holds that $G'$ has average
degree at least $\epsilon/2$. We thus continue under the assumption that $cut(N_2, N_3) < \epsilon d_H^2$.

Now assume that there exists a $w \in N_2$ such that $W_3(w, v) \ge d_H\epsilon$. Observe that all the
length 3 walks between $w$ and $v$ must pass through $N_3$ and $N_4$. Consider the graph induced
by the neighbors of $w$ in $N_3$, and the set $N_4$. Since $w$ has at most $d_H$ neighbors in $N_3$,
this graph contains at most $2d_H$ vertices and $\Omega(d_H\epsilon)$ edges, implying an average degree of
$\Omega(\epsilon)$. We thus continue under the assumption that for every $w \in N_2$, $W_3(w, v) < d_H\epsilon$ and
for every $w \in N_3$, $W_3(u, w) < d_H\epsilon$ (here we use the symmetry of these two assumptions).

We now show, in the setting at hand, that we may also assume that every edge between 
$N_2$ and $N_3$ lies in at least $\frac{(d^*)^5}{2 d_H^2 \varepsilon k}$ walks from $u$ to $v$ (without losing much). Indeed, remove 
any edge between $N_2$ and $N_3$ that lies in fewer than $\frac{(d^*)^5}{2 d_H^2 \varepsilon k}$ length 5 walks from $u$ to $v$. Since 
the number of edges between $N_2$ and $N_3$ is less than $d_H^2 \varepsilon$ (by our first assumption), we 
disconnect at most $\frac{(d^*)^5}{2 d_H^2 \varepsilon k} \cdot d_H^2 \varepsilon = \frac{(d^*)^5}{2k}$ walks. So we remain with $\Omega\left(\frac{(d^*)^5}{k}\right)$ walks in $W_5(u,v)$. 

Let $e = (w,z)$ be an arbitrary edge between $w \in N_2$ and $z \in N_3$. By our assumptions, 
$e$ lies in $p \ge \frac{(d^*)^5}{2 d_H^2 \varepsilon k}$ walks from $u$ to $v$. Clearly $p \le deg(w, N_1) \cdot deg(z, N_4)$. Thus, either 
$deg(w, N_1) \ge \sqrt{\frac{(d^*)^5}{2 d_H^2 \varepsilon k}}$ or $deg(z, N_4) \ge \sqrt{\frac{(d^*)^5}{2 d_H^2 \varepsilon k}}$. If the first one is true, we say $w$ is a 'good' 
vertex, otherwise $z$ is the 'good' one. Now we activate the following procedure: 



1) we choose for every edge e, its good vertex (denoted by w). 

2) we put that vertex in S2 or S3 respectively if it comes from N2 or N3. 

3) we remove all the edges that touch w from the graph. 

Let $w$ be a good vertex in $N_2$ (a similar analysis holds for good vertices in $N_3$). By our 
second assumption, $W_3(w,v) < d_H \varepsilon$. Observe that in step 3 above, we only discard the 
length 5 walks from $u$ to $v$ that go through $w$. The number of walks between $u$ and $w$ 
(which equals $deg(w, N_1)$) is bounded above by $d_H$. The number of walks between $u$ 
and $v$ passing through $w$ is thus bounded by $d_H \varepsilon \cdot d_H = d_H^2 \varepsilon$. 

Since we have $\Omega\left(\frac{(d^*)^5}{k}\right)$ walks between $u$ and $v$, and each iteration removes at most 
$d_H^2 \varepsilon$ of them, we can fix the number of iterations to be $\frac{(d^*)^5}{c\, d_H^2 \varepsilon k}$ for a constant $c > 1$. Thus the 
total number of 'good' vertices found by the algorithm is $\frac{(d^*)^5}{c\, d_H^2 \varepsilon k}$. 

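W.l.o.g. assume that $|S_2| \ge |S_3|$. Now consider the subgraph induced by $S_2 \cup N_1$. This 
subgraph contains $O\left(d_H + \frac{(d^*)^5}{d_H^2 \varepsilon k}\right)$ vertices, out of which $\Omega\left(\frac{(d^*)^5}{d_H^2 \varepsilon k}\right)$ vertices have degree at least 
$\sqrt{\frac{(d^*)^5}{2 d_H^2 \varepsilon k}}$. Thus the average degree of this subgraph is at least 

$\Omega\left( \frac{\left(\frac{(d^*)^5}{d_H^2 \varepsilon k}\right)^{3/2}}{d_H + \frac{(d^*)^5}{d_H^2 \varepsilon k}} \right)$. 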

It follows from basic computations that for a constant $c'$, setting $\varepsilon$ to be 

$c' \cdot \min\left( \frac{(d^*)^3}{k^{0.6} d_H^{1.6}}, \frac{(d^*)^{5/3}}{k^{1/3} d_H^{2/3}} \right)$ 

the above average degree is at least $\varepsilon$. Notice also that since we assumed $k > d_H^2$, then 
$|S_2| \le d_H^2 < k$; namely, the size of the group we selected is at most $k$. 

All in all, in every sub-case specified above we obtain a subgraph $G'$ of size at most $k$ 
with average degree $\Omega(\varepsilon)$. 

Case 2: $d_H^2 > k > d_H$. Assume that $cut(N_2, N_3) > \frac{d_H^4 \varepsilon}{k}$. We can construct a subset 
of (expected) size at most $k$ by picking each vertex in $N_2 \cup N_3$ with probability $k/2d_H^2$. 
Thus, using Lemma 4.5, we may obtain a subgraph with at most $k$ vertices and $k\varepsilon/4$ edges. 
We thus continue under the assumption that $cut(N_2, N_3) < \frac{d_H^4 \varepsilon}{k}$. 

Now assume that there exists $w \in N_2$ with $W_3(w,v) > d_H \varepsilon$. Observe that all the length 
3 walks between $w$ and $v$ must pass through $N_3$ and $N_4$. Consider the graph induced by the 
neighbors of $w$ in $N_3$, and the set $N_4$. Since $w$ has at most $d_H$ neighbors in $N_3$, this graph 
contains at most $2d_H + 1$ vertices and $\Omega(d_H \varepsilon)$ edges, implying a subgraph of size less than 
or equal to $k$ with an average degree of $\Omega(\varepsilon)$. We thus continue under the assumption that 
for every $w \in N_2$, $W_3(w,v) < d_H \varepsilon$, and for every $w \in N_3$, $W_3(u,w) < d_H \varepsilon$ (here we use the 
symmetry of these two assumptions). 

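We now show, in the setting at hand, that we may also assume that every edge between 
$N_2$ and $N_3$ lies in at least $\frac{(d^*)^5}{2 d_H^4 \varepsilon}$ walks from $v$ to $u$. 

Indeed, remove any edge between $N_2$ and $N_3$ that lies in fewer than $\frac{(d^*)^5}{2 d_H^4 \varepsilon}$ length 5 walks 
from $v$ to $u$. Since the number of edges between $N_2$ and $N_3$ is less than $\frac{d_H^4 \varepsilon}{k}$ (by our first 
assumption), we disconnect at most $\frac{(d^*)^5}{2 d_H^4 \varepsilon} \cdot \frac{d_H^4 \varepsilon}{k} = \frac{(d^*)^5}{2k}$ walks in $W_5(u,v)$, so we remain 
with $\Omega\left(\frac{(d^*)^5}{k}\right)$ walks (recall that $W_5(u,v) \ge \frac{(d^*)^5}{k}$). 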

Let $e = (w,z)$ be an arbitrary edge between $w \in N_2$ and $z \in N_3$. By our assumptions, 
$e$ lies in $p \ge \frac{(d^*)^5}{2 d_H^4 \varepsilon}$ walks from $u$ to $v$. Clearly $p \le deg(w, N_1) \cdot deg(z, N_4)$. Thus, either 
$deg(w, N_1) \ge \sqrt{\frac{(d^*)^5}{2 d_H^4 \varepsilon}}$ or $deg(z, N_4) \ge \sqrt{\frac{(d^*)^5}{2 d_H^4 \varepsilon}}$. If the first one is true, we say $w$ is a 'good' 
vertex, otherwise $z$ is the 'good' one. Now we use the same analysis and procedure as in 
the corresponding part of Case 1. Namely, we construct the sets $S_2$ and $S_3$. As before, for every 
$w \in N_2$, by our assumption above, $W_3(w,v) < d_H \varepsilon$. Moreover, in our procedure we only 
discard the length 5 walks from $u$ to $v$ that go through $w$. Since $W_3(w,v) < d_H \varepsilon$ and since 
the number of walks between $u$ and $w$ (which equals $deg(w, N_1)$) is bounded above by $d_H$, 
the number of walks between $u$ and $v$ passing through $w$ is bounded by 
$d_H \varepsilon \cdot d_H = d_H^2 \varepsilon$. 

Since we have $\Omega((d^*)^5/k)$ walks between $v$ and $u$, and each iteration removes only $d_H^2 \varepsilon$ 
of them, the number of iterations can be at least $\Theta\left(\frac{(d^*)^5}{d_H^2 \varepsilon k}\right)$. However, if this number is 
greater than $k$, we stop after $k$ iterations. We proceed under the assumption that 
$\Theta\left(\frac{(d^*)^5}{d_H^2 \varepsilon k}\right) < k$, i.e. the total number of 'good' vertices found by the procedure is $\Theta\left(\frac{(d^*)^5}{d_H^2 \varepsilon k}\right)$. 
We deal with the case $\Theta\left(\frac{(d^*)^5}{d_H^2 \varepsilon k}\right) > k$ at the end of the proof. 

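W.l.o.g. assume that $|S_2| \ge |S_3|$. Now consider the subgraph induced by $S_2 \cup N_1$. It 
contains $O\left(d_H + \frac{(d^*)^5}{d_H^2 \varepsilon k}\right)$ vertices, out of which $\Theta\left(\frac{(d^*)^5}{d_H^2 \varepsilon k}\right)$ vertices have degree at least 
$\sqrt{\frac{(d^*)^5}{2 d_H^4 \varepsilon}}$. Thus the average degree of this subgraph is at least 

$\Omega\left( \frac{\frac{(d^*)^5}{d_H^2 \varepsilon k} \sqrt{\frac{(d^*)^5}{d_H^4 \varepsilon}}}{d_H + \frac{(d^*)^5}{d_H^2 \varepsilon k}} \right)$. 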



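It follows from basic computations that for a constant $c'$, setting $\varepsilon$ to be 

$\Omega\left( \min\left( \frac{(d^*)^3}{k^{0.4} d_H^2}, \frac{(d^*)^{5/3}}{d_H^{4/3}} \right) \right)$ 

the above average degree is at least $\varepsilon$. 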

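In case $\Theta\left(\frac{(d^*)^5}{d_H^2 \varepsilon k}\right) > k$ we have that $S_2$ is of size $\Theta(k)$ (as we stopped the iterative process 
early). Thus, as $k > d_H$, in this case we obtain an average degree $\varepsilon'$ of 

$\varepsilon' = \Omega\left( \frac{k}{d_H + k} \sqrt{\frac{(d^*)^5}{d_H^4 \varepsilon}} \right) \ge \Omega\left( \frac{k}{2k} \sqrt{\frac{(d^*)^5}{d_H^4 \varepsilon}} \right) \ge \Omega\left( \frac{(d^*)^{5/3}}{d_H^{4/3}} \right) \ge \varepsilon$. 

So limiting our subgraph to size $k$ does not change the overall ratio. 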

So, all in all (in Cases 1 and 2 analyzed above), we have found a subgraph of size at most 
$k$ with average degree $\Omega\left(\min\left(\frac{(d^*)^3}{k^{0.6} d_H^{1.6}}, \frac{(d^*)^{5/3}}{k^{1/3} d_H^{2/3}}\right)\right)$ for the case where $d_H^2 < k$, and 
$\Omega\left(\min\left(\frac{(d^*)^3}{k^{0.4} d_H^2}, \frac{(d^*)^{5/3}}{d_H^{4/3}}\right)\right)$ when $d_H^2 > k > d_H$. The size being at most $k$ implies a 
solution for the DamkS problem. But as we have shown in an earlier section, there is a reduction 
between the DamkS and the DkS problems with only a constant-factor loss in approximation. 
Thus we use this calculated average degree to get the ratio $r_5$ stated in Lemma 5.7. □ 



5.4. Proof of Theorem 5.1: Combining algorithms $A_1, \ldots, A_5$. Using Lemmas 5.2-5.7 
we may now compute a lower bound on the approximation ratio of the combined algorithm 
of [5]: $\max_{k, d^*, d_H} \min_{i=1,\ldots,5} r_i(k, d^*, d_H)$. We do this numerically, going over many triplets 
$(k, d^*, d_H)$ and computing $\min_{i=1,\ldots,5} r_i(k, d^*, d_H)$ for each such triplet. This suffices to yield 
the lower bound in Theorem 5.1 for the ratio of [5]. Our C program, which performs the 
computation above, is given in Section 7.1 and runs with precision 0.00001. In Section 6 we 
show that we can improve upon this ratio. 

6. Improving on the approximation of DkS 

As we have shown in the previous section, prior to our work, the best approximation ratio 
for the Dense-k-Subgraph problem stands at $n^{\delta}$ for a constant $\delta > 0.32258$. Improving upon 
this approximation ratio is a long-standing open problem. 

In [5] a total of 5 algorithms are presented. The first three algorithms of [5] tend to work 
'well' in several cases. One example is very dense graphs, where $d_H$, the 
average degree of the $k/2$ vertices with the highest degree in $G$, is high relative to $n$. Other 
cases include the case in which $k$ itself is very large relative to $n$, or in which $d^*$, the average 
degree of the optimal subgraph, is high relative to $\max(k, d_H)$ and/or $n$. When each of 
these parameters is very small (less than $n^{1/3}$), the algorithm gets a good approximation 
for trivial reasons. Two worst cases for these algorithms were isolated in [5], in which the 
analysis gave exactly an $O(n^{1/3})$ approximation: 

(6.1) $d^*(G, k) = \Theta(n^{1/3})$, $k = \Theta(n^{1/3})$, $d_H = \Theta(n^{2/3})$. 

(6.2) $d^*(G, k) = \Theta(n^{1/3})$, $k = \Theta(n^{2/3})$, $d_H = \Theta(n^{1/3})$. 

where $d_H$ is the average degree of the $k/2$ vertices with the highest degree in $G$. To improve 
upon this ratio, [5] suggests two additional algorithms, $A_4$ and $A_5$, one for each isolated 
region presented above. The result is the approximation ratio stated in Theorem 5.1. In 
what follows we present an additional algorithm $A_6$ tailored to improve the ratio on the 
configurations of $(k, d^*, d_H)$ in which the algorithms of [5] work poorly. More specifically, 
our algorithm is designed to address the region in Equation (6.2) above. 

Analyzing the ratio of our additional algorithm Aq, we are able to prove the following 
theorem: 

Theorem 6.1. The DkS problem admits an approximation ratio of at most $n^{0.3159}$. 

The proof of Theorem 6.1 includes four steps. First, recall by Lemma 5.4 that one may 
assume w.l.o.g. that the input graph has maximum degree $d_H$. Then we turn to present our 
algorithm $A_6$. We start by presenting in Section 6.1 an algorithm for the DamkS problem 
that does not return a dense subgraph of size $k$ but rather of size at most $k$. Our algorithm 
is based on a linear programming relaxation and assumes that the given input graph has 
bounded degree (which, as discussed, is w.l.o.g.). To obtain our algorithm $A_6$ for DkS, we use 
our earlier analysis, in which we show that any algorithm for DamkS implies 
an algorithm for DkS with roughly the same approximation ratio. Finally, in Section 7 we 
analyze the approximation ratio of algorithms $A_1, A_2, A_3, A_4$ combined with our $A_6$: 

$\max_{k, d^*, d_H} \min_{i = 1, \ldots, 4, 6} r_i(k, d^*, d_H)$. 

Here, as our calculations are numerical, we need to bound any error term that may arise 
due to the fact that we are not going over all possible triplets $(k, d^*, d_H)$ but rather only 
checking a grid of triplets at limited precision. We bound the error term by 0.0000433 in 
Section 7. The result of Theorem 6.1 takes this error term into account. Our numerical 
calculations (i.e. our C program) appear in Section 7.1. 

6.1. Our algorithm for DamkS. Our algorithm is based on a linear program (LP). We 
use the fractional LP solution in order to find a DamkS solution with an average degree of 
$\Omega\left(\left(\frac{(d^*)^7}{d_H^4 k}\right)^{1/3}\right)$. 

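6.1.1. The DamkS LP. Let $G(V, E)$ be a given graph and let $V = \{1, \ldots, n\}$. Let $i_0$ be some 
vertex in $V$, and let $\gamma$ be a parameter. Our linear program for DamkS follows: 

(6.3) $\min \sum_{i \in V} y_i$ 

(6.4) s.t. $y_{i_0} = 1$ 

(6.5) $\forall i \in V$: $\gamma y_i \le \sum_{j : ij \in E} x_{ij}$ 

(6.6) $\forall ij \in E$: $x_{ij} \le y_i$ 

(6.7) $\forall ij \in E$: $x_{ij} \le y_j$ 

(6.8) $\forall i \in V$: $y_i \in [0, 1]$ 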



In what follows, let $d^*$ be the average degree of the optimal solution to the DamkS 
problem. 

Lemma 6.2. There exists a vertex $i_0$ such that solving the DamkS LP with $\gamma = d^*/2$ yields 
an optimal solution of value at most $k$. 
Proof. Consider the optimal solution $U \subseteq V$ to the DamkS problem on $G$. Let $d^*$ be the 
average degree of $U$. In Lemma 2 of pQ it is shown that $U$ must include a subgraph $U'$ of 
minimal induced degree at least $d^*/2$. Let $i_0$ be a vertex in $U'$. Setting $y_i = 1$ for each 
vertex in $U'$ and 0 otherwise, and setting $x_{ij} = 1$ for each edge in $U'$ and 0 otherwise, 
yields a valid solution to our LP of value $|U'| \le k$. This implies our assertion. □ 
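
The feasibility argument in the proof can be checked mechanically on small graphs. The sketch below is illustrative only: it reads constraint (6.5) as $\gamma y_i \le \sum_j x_{ij}$ (an assumption about the garbled original), uses a fixed-size adjacency matrix, and the name `lp_integral_value` is hypothetical.

```c
#include <assert.h>

#define MAXN 16

/* Checks the integral solution from the proof of Lemma 6.2:
   y_i = 1 on U' (in_u[i] != 0) and x_ij = 1 on edges inside U'.
   With these 0/1 values, constraints (6.6)-(6.8) hold automatically,
   so only (6.4) and (6.5) are checked.
   Returns the LP objective |U'| if feasible, or -1 otherwise. */
int lp_integral_value(int n, int adj[MAXN][MAXN],
                      const int in_u[MAXN], int i0, double gamma)
{
    assert(n > 0 && n <= MAXN && i0 >= 0 && i0 < n);
    if (!in_u[i0])
        return -1;                 /* (6.4): y_{i0} must equal 1 */
    int size = 0;
    for (int i = 0; i < n; i++) {
        if (!in_u[i])
            continue;              /* y_i = 0: (6.5) reads 0 <= 0 */
        int deg_in_u = 0;          /* sum over j of x_ij */
        for (int j = 0; j < n; j++)
            if (adj[i][j] && in_u[j])
                deg_in_u++;
        if (gamma > deg_in_u)
            return -1;             /* (6.5) violated at vertex i */
        size++;
    }
    return size;
}
```

For instance, if $U'$ is a 4-clique and $\gamma = 3$, every vertex of $U'$ has exactly 3 neighbors inside $U'$, so the solution is feasible with value 4; with $\gamma = 4$ it is not.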

In what follows we assume that we have guessed the correct values for $i_0$ and $\gamma$ as stated in 
the lemma above. To ensure this in practice we run our algorithm (yet to be defined) 
for all possible values of $i_0$ (there are $n$ such values) and for all values of $\gamma$ in the set 
$\{1, 2, 4, 8, \ldots, n\}$, taking the output to be the best out of these $n \log n$ executions. As we 
are approximating $\gamma$ within a factor of 2, we incur a slight loss in our approximation ratio 
which (as we will see) can be bounded by a constant value of 4 (and is thus insignificant 
to our results). We also assume that $\gamma$ is at least $n^{0.01}$; otherwise $A_1$ yields an excellent 
approximation ratio. Finally, in what follows we refer to the variable $y_{i_0}$ as $y_0$. 
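
The guessing step can be sketched as follows. This is an illustration under the assumption that the tried values of $\gamma$ are exactly the powers of two not exceeding $n$; the helper names are hypothetical.

```c
#include <assert.h>

/* Number of (i0, gamma) pairs tried: n choices of i0 times the
   powers of two 1, 2, 4, ... that do not exceed n. */
long guess_count(long n)
{
    assert(n >= 1);
    long gammas = 0;
    for (long gamma = 1; gamma <= n; gamma *= 2)
        gammas++;
    return n * gammas;
}

/* Largest tried gamma that does not exceed target (target >= 1).
   When target <= n, the result is within a factor of 2 of target. */
long best_gamma_guess(long n, double target)
{
    assert(n >= 1 && target >= 1.0);
    long best = 1;
    for (long gamma = 1; gamma <= n && gamma <= target; gamma *= 2)
        best = gamma;
    return best;
}
```

For `n = 8` the tried values of $\gamma$ are 1, 2, 4, 8, so `guess_count(8)` is 32, and a target of 5.5 is approximated from below by the guess 4.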

The DamkS LP gives weights $y_i$ to the vertices. Next we present an algorithm that uses 
these weights in order to extract a subgraph with at most $k$ vertices. 



6.1.2. The algorithm. We start with some definitions. Let $N_0$ be the set consisting of the 
single vertex corresponding to $y_0$ in our LP. Refer to this vertex as $v_0$. Let $N_1$, $N_2$ and 
$N_3$ be the groups of vertices that are at distance one, two, and three from $v_0$ respectively. 
Here the distance between two vertices is the length of the shortest path connecting them. 
Notice that for every $i \ne j$: $N_i \cap N_j = \emptyset$. 

Let $d_H$ be the maximum degree in the input graph. In our algorithm we consider two 
candidate subgraphs. 

The subgraph $S_1$ is constructed by randomly picking vertices $v_i$ from $N_0 \cup N_1 \cup N_2$. Each 
vertex $v_i$ is picked independently with its corresponding probability $y_i$. The subgraph $S_2$ 
is constructed by randomly picking vertices $v_i$ from $N_1 \cup N_2 \cup N_3$; again each vertex is 
picked with its corresponding probability $y_i$. For our analysis, we assume that the random 
choices for $S_1$ and $S_2$ are fresh (i.e. independent). We choose the denser of the two 
subgraphs. 
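
The construction of the two candidates can be sketched as a breadth-first search followed by independent coin flips. This is a minimal illustration, not the thesis' implementation: the adjacency-matrix representation, the bound `MAXN`, the use of `rand()`, and the function names are all assumptions made here.

```c
#include <assert.h>
#include <stdlib.h>

#define MAXN 16

/* Breadth-first search from v0; dist[i] = -1 if i is unreachable. */
void bfs_layers(int n, int adj[MAXN][MAXN], int v0, int dist[MAXN])
{
    assert(n > 0 && n <= MAXN && v0 >= 0 && v0 < n);
    int queue[MAXN], head = 0, tail = 0;
    for (int i = 0; i < n; i++)
        dist[i] = -1;
    dist[v0] = 0;
    queue[tail++] = v0;
    while (head < tail) {
        int u = queue[head++];
        for (int w = 0; w < n; w++)
            if (adj[u][w] && dist[w] < 0) {
                dist[w] = dist[u] + 1;
                queue[tail++] = w;
            }
    }
}

/* Picks each vertex in layers lo..hi independently with probability y[i].
   Sets picked[i] to 0 or 1 and returns the number of picked vertices. */
int sample_layers(int n, const int dist[MAXN], const double y[MAXN],
                  int lo, int hi, int picked[MAXN])
{
    assert(n > 0 && n <= MAXN && lo <= hi);
    int count = 0;
    for (int i = 0; i < n; i++) {
        double coin = rand() / (RAND_MAX + 1.0);   /* uniform in [0, 1) */
        picked[i] = dist[i] >= lo && dist[i] <= hi && coin < y[i];
        count += picked[i];
    }
    return count;
}
```

With this sketch, $S_1$ corresponds to `sample_layers(n, dist, y, 0, 2, picked)` and $S_2$ to `sample_layers(n, dist, y, 1, 3, picked)`, each called with fresh randomness.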

First we prove a technical lemma to be used later in our proof. 
Lemma 6.3. $\sum_{i=1}^{n} y_i^2 \ge \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$. 
Proof. By the Cauchy-Schwarz inequality we get 

$\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} 1 \cdot y_i \le \sqrt{\sum_{i=1}^{n} 1^2} \sqrt{\sum_{i=1}^{n} y_i^2} = \sqrt{n \sum_{i=1}^{n} y_i^2}$. 

Squaring both sides and rearranging gives the lemma. □ 
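
A quick numeric check of the inequality, offered as an illustration only (the function name is hypothetical): the returned gap $\sum_i y_i^2 - (\sum_i y_i)^2/n$ should never be negative, and it is zero when all $y_i$ are equal.

```c
#include <assert.h>

/* Returns sum(y_i^2) - (sum y_i)^2 / n, which Lemma 6.3 says is >= 0. */
double cauchy_schwarz_gap(const double *y, int n)
{
    assert(n > 0);
    double sum = 0.0, sum_sq = 0.0;
    for (int i = 0; i < n; i++) {
        sum += y[i];
        sum_sq += y[i] * y[i];
    }
    return sum_sq - sum * sum / n;
}
```

For `y = {1, 0.5, 0.25, 0.25}` the sum is 2 and the sum of squares is 1.375, so the gap is 0.375.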

Lemma 6.4. Algorithm $A_6$ yields an approximation ratio of $r_6 = \Theta\left(\left(\frac{d_H^4 k}{\gamma^4}\right)^{1/3}\right) = \Theta\left(\left(\frac{d_H^4 k}{(d^*)^4}\right)^{1/3}\right)$. 

Proof. Let us call $Q_0, Q_1, Q_2$ and $Q_3$ the sums of the $y_i$ values in $N_0, N_1, N_2$ and $N_3$ respectively. 
By the LP feasibility of the $y_i$'s and due to constraints (6.4), (6.5), (6.6) and (6.7) we 
conclude $Q_1 \ge \gamma$. We calculate the average degree of $S_1$. 

We start by bounding the expected number of edges in $S_1$ from below. To this end, we 
calculate the expected sum of degrees of vertices in $S_1 \cap N_1$. Notice that each edge $ij$ 
adjacent to $N_1$ will appear in the subgraph induced by $S_1$ with probability $y_i y_j$. Thus: 

(6.9) $\sum_{i \in N_1} \left( y_i \sum_{ij \in E} y_j \right) \ge \sum_{i \in N_1} \gamma y_i^2 \ge \frac{\gamma Q_1^2}{d_H}$ 

Here we used Lemma 6.3 in the last inequality (and the fact that $|N_1| \le d_H$). Thus we 
have that the expected number of edges in $S_1$ is at least $\frac{\gamma Q_1^2}{2 d_H}$. 

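We now turn to study the expected size of $S_1$. This expectation is exactly $Q_0 + Q_1 + Q_2$ 
(which is at most $k$). Thus if both the number of edges in $S_1$ and its size behave as 
expected, we obtain a subgraph $S_1$ of size at most $k$ with average degree $d_1 = \Omega\left(\frac{\gamma Q_1^2}{d_H (Q_0 + Q_1 + Q_2)}\right)$. 
Now if $Q_2 < 2Q_1$ then we obtain $d_1 = \Omega\left(\frac{\gamma Q_1}{d_H}\right) \ge \Omega\left(\frac{\gamma^2}{d_H}\right)$, which is an excellent ratio. 

Otherwise, $Q_2 \ge 2Q_1$, and hence: 

$d_1 = \Omega\left(\frac{\gamma Q_1^2}{d_H Q_2}\right) \ge \Omega\left(\frac{\gamma^3}{d_H Q_2}\right)$. 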

It is left to show that with some inverse-polynomial probability it is indeed the case that both 
the size and the number of edges in $S_1$ behave as expected. The number of edges in $S_1$ 
is bounded by $n^2$, and thus by Markov's inequality this number is at 
least half its expectation with probability that is at least inverse polynomial in $n$. Regarding 
the number of vertices in $S_1$, as these are chosen independently, the Chernoff bound gives 

$\Pr[|S_1| < 2(Q_0 + Q_1 + Q_2)] > 1 - e^{-\Omega(Q_0 + Q_1 + Q_2)}$. 

As $Q_0 + Q_1 + Q_2 \ge Q_1 \ge \gamma \ge n^{0.01}$ (the latter by our assumption discussed at the beginning 
of this section), it holds that (for sufficiently large $n$) both the size of $S_1$ and the number 
of edges it induces are within a factor of 2 of their expectation with probability that is at least 
inverse polynomial in $n$, since the failure probability $e^{-\Omega(n^{0.01})}$ of the Chernoff bound is 
smaller than any inverse polynomial. 

We now address $S_2$. As before, the expected sum of degrees in the subgraph induced 
by $S_2$ is at least $\sum_{i \in N_2} \left( y_i \sum_{ij \in E} y_j \right) \ge \sum_{i \in N_2} \gamma y_i^2 \ge \frac{\gamma Q_2^2}{d_H^2}$. Again we used Lemma 6.3 
for the last inequality (here, $|N_2| \le d_H^2$). We divide the expected sum of degrees by the 
expected size of $S_2$ to get 

$d_2 = \frac{\gamma Q_2^2 / d_H^2}{Q_1 + Q_2 + Q_3} \ge \frac{\gamma Q_2^2}{d_H^2 k}$. 

In the last inequality we used the fact that $k \ge LP \ge Q_0 + Q_1 + Q_2 + Q_3 \ge Q_1 + Q_2 + Q_3$ 
(which also implies that in expectation $|S_2| \le k$). 

Again we show that the size and the number of edges in $S_2$ behave as expected with 
some non-negligible probability. The number of edges in $S_2$ is bounded by $n^2$, and thus 
by Markov's inequality this number is at least half its expectation with 
probability that is at least inverse polynomial in $n$. Regarding the number of vertices in $S_2$, 
as these are chosen independently, the Chernoff bound gives 

$\Pr[|S_2| < 2(Q_1 + Q_2 + Q_3)] > 1 - e^{-\Omega(Q_1 + Q_2 + Q_3)}$. 

As $Q_1 + Q_2 + Q_3 \ge Q_1 \ge \gamma \ge n^{0.01}$, it holds that (for sufficiently large $n$) both the size of 
$S_2$ and the number of edges it induces are within a factor of 2 of their expectation with 
probability that is at least inverse polynomial in $n$. 

Thus, all in all, with inverse-polynomial probability the average degrees of $S_1$ and $S_2$ are 
at least $\Omega(d_1)$ and $\Omega(d_2)$ respectively, and both $S_1$ and $S_2$ are of size at most $2k$. Reducing 
the size of $S_1$ and $S_2$ (if needed) by Lemma 4.5 will not change the value of $d_1$ or $d_2$ 
significantly. Notice that when $Q_2$ increases then so does $d_2$, while when $Q_2$ decreases then 
$d_1$ increases. It now follows (by picking the worst value of $(d_H k \gamma^2)^{1/3}$ for $Q_2$) that we 
are guaranteed an average degree of $\Theta\left(\left(\frac{\gamma^7}{d_H^4 k}\right)^{1/3}\right)$ with the above probability. Repeating the 
above rounding procedure a polynomial number of times (say $n^5$) and taking the densest 
obtained subgraph, we conclude that the above average degree is obtained with arbitrarily 
high (constant) probability. This suffices to prove our lemma. □ 



We have integrated the results of this section into our C program, which calculates the final 
ratio of combining algorithms $A_1, A_2, A_3, A_4$ and our $A_6$. The resulting ratio is $r = 0.3159$. 
Our detailed calculations appear in Section 7. 
7. Computing the final ratio using a C program 

Our C program runs on a grid of values for $d^*$, $d_H$ and $k$ and computes the resulting 
approximation ratio (as a function of $n$) for every point in this three-dimensional grid. To 
be more specific, the set of values for $d^*$ (and also $d_H$ and $k$) grows exponentially 
and consists of the set $\{n^i \mid i = 0, \Delta, 2\Delta, \ldots, 1\}$ of size $1/\Delta$ for a precision parameter 
$\Delta$. The worst-case setting of values for $d^*$, $d_H$ and $k$ may be (and probably is) located between 
the grid points and not necessarily on one of them. To compensate for this fact, we analyze 
the maximum loss in our ratio that may occur. Technically, as our C program computes 
the exponent of $n$ in the resulting ratio, our loss can be computed using the linear nature 
of the equations defining the (logarithm of the) ratio of each of the algorithms $A_i$. For 
example, consider $r_2 = \frac{2 n d^*}{k d_H}$. Neglecting the factor of 2 and setting $d^* = n^g$, $d_H = n^d$, 
and $k = n^{\kappa}$, we have that $r_2 = n^{g - \kappa - d + 1}$. Thus, the total loss in considering our grid of 
precision $\Delta$ in this case will be $n^{3\Delta}$ (a single factor of $\Delta$ for each one of $g$, $\kappa$, and $d$). Let 
us compute our total error $\delta_{err}$ corresponding to the value of $\Delta$ we are using. 

The approximation ratio of Algorithm $A_1$ is $r_1(k, d^*, d_H) = d^*$. In the C program 
this translates into $n^g$, so $\delta_1 = 1 \cdot \Delta$. The approximation ratio of Algorithm $A_2$ is 
$r_2(k, d^*, d_H) = \frac{2 n d^*}{k d_H}$. In the C program this translates into $n^{g - \kappa - d + 1}$, so $\delta_2 \le 3\Delta$. The 
approximation ratio of Algorithm $A_3$ is $r_3(k, d^*, d_H) = 2\max(k, d_H)/d^*$. In the C program 
this translates into $n^{\max(\kappa, d) - g}$, so $\delta_3 \le 2\Delta$. The approximation ratio of Algorithm $A_4$ is 
$r_4(k, d^*, d_H) = \Theta\left(\frac{k^2 d_H^{1/3}}{(d^*)^2}\right)$. In the C program this translates into $n^{2\kappa + \frac{1}{3}d - 2g}$, so $\delta_4 \le 4\frac{1}{3}\Delta$. 
When using Algorithm $A_6$ in our program, algorithm $A_5$ is not needed (as it does not 
contribute to improving the approximation ratio), hence we do not have to consider $\delta_5$. 
The approximation ratio of Algorithm $A_6$ is $r_6(k, d^*, d_H) = \left(\frac{d_H^4 k}{(d^*)^4}\right)^{1/3}$. In the C program 
it translates into $n^{(4d + \kappa - 4g)/3}$, so $\delta_6 \le 3\Delta$. Our total error $\delta_{err}$ is thus bounded by 

$\max(\delta_1, \delta_2, \delta_3, \delta_4, \delta_6) = 4\frac{1}{3}\Delta$. 

Running our C program while considering only the algorithms $A_1, \ldots, A_5$ of [5] with 
$\Delta = 0.00001$, we receive the ratio $r_{previous} = 0.32258$. This suffices to prove Theorem 5.1, 
as all we seek is a lower bound on the ratio derived from the stated values of $r_1, \ldots, r_5$. 

Running our C program with the algorithms of [5] and our additional algorithm $A_6$ with 
$\Delta = 0.00001$, we receive the ratio 0.315787. Adding the error $\delta_{err} = 0.0000433$ results in 
the ratio of $r_{new} < 0.3159$ stated in Theorem 6.1. 

7.1. Our C program for computing the approximation ratio. Our program appears 
in Figure 1. 

#include <stdio.h>

static double min(double a, double b) { return a < b ? a : b; }
static double max(double a, double b) { return a > b ? a : b; }

int main(void)
{
    double g = 0;        // d*  = n^g
    double K = 0;        // k   = n^K
    double d = 0;        // d_H = n^d
    double r = 0;        // temp ratio
    double rr = 0;       // final ratio
    double p = 0.00001;  // step
    for (g = 0.0; g <= 1; g += p) {
        for (d = g; d <= 1; d += p) {
            for (K = g; K <= 1; K += p) {
                // due to algorithm A1
                r = g - 0;
                // due to A2
                r = min(r, g - K - d + 1);
                // due to A3
                r = min(r, g - 2*g + max(K, d));
                // due to A4
                r = min(r, g - 3*g + 2*K + d/3.0);
                // due to A5
                if (2*d <= K)
                    r = min(r, g - min((3*g - 1.6*d - 0.6*K), (5.0*g - K - 2.0*d)/3.0));
                else if ((K < 2*d) && (K > d))
                    r = min(r, g - min(3*g - 2*d - 0.4*K, (5.0*g - 4.0*d)/3.0));
                // due to A6, our LP algorithm
                r = min(r, g - (7.0*g - 4.0*d - K)/3.0);
                if (rr < r) {
                    rr = r;
                    printf("d=%f K=%f g=%f rr=%f\n", d, K, g, rr);
                }
            }
        }
    }
    return 0;
}



Figure 1. Our C-program that computes the total ratio of all the sub 
algorithms combined. 



